[57] ABSTRACT

The system and method of the present invention uses a 
zero-crossing rate measurement in order to determine the 
initiation and/or termination of speech in an audio signal 
input. It is especially well suited for detecting the termination
of a telephone message in a telephone answering device.
Specifically, a sample of the zero-crossing rate signal is 
determined by counting the number of consecutive speech 
samples required for the occurrence of a pre-defined number 
of consecutive zero-crossings. The resultant zero-crossing 
rate signal is smoothed and applied to a differentiator. A 
short-time magnitude integration is performed to measure 
the energy in the differentiated signal. The output of the 
magnitude integration is provided to a threshold detector 
which produces a sequence of decision values indicating the 
presence or absence of speech. Finally, the decision values 
are filtered to produce a more definitive sequence of final 
decision values. 

22 Claims, 5 Drawing Sheets 
DETECTION OF TONAL SIGNALS

FIELD OF THE INVENTION

The present invention relates generally to the field of speech
detection, and more specifically to an improved system and
method for detecting initiation and/or termination of a speech
message in a voice storage device or telephone answering device.

DESCRIPTION OF THE RELATED ART

Telephone answering machines are a fundamental artifact of the
modern life-style. A fundamental problem connected with
answering machine performance is that of detecting the end of a
message. Since the answering machine employs a finite storage
media (tape or RAM) to record in-coming speech messages, it is
essential that the answering machine be able to accurately
detect the end of these messages. The end of a message can occur
in many ways, but the result is nearly always some form of tonal
sequence (i.e. sequence of tones) or background noise (silence).
For the sake of discussion, this end of message signal, which
ensues upon the conclusion of the speech signal, will be called
the termination signal. It is simple to distinguish silence from
speech by the use of a simple energy measure. Background noise
usually has much smaller power, and thus energy, than a speech
signal. However, tonal signals, which represent the most typical
termination signal, contain high energy. Thus the energy measure
fails as a general technique for distinguishing speech from
termination signals.

The problem of detecting the end of a message is compounded by
the fact that the nature of the tones is best assumed to be
unknown. Dial tone is the most common result, but this varies
from country to country, and may even vary across private branch
exchanges (PBX's). Other signals may also occur which may have
an on-off cadence, and which may contain a variety of
frequencies.

It should be noted that the problem of detecting the termination
of speech in an answering machine message is part of the more
general problem of detecting the initiation and termination
(i.e. the endpoints) of speech in a noise environment. One prior
art endpoint detection system employs zero-crossing rate (ZCR)
and short-time energy measurements with statistically determined
detection thresholds [Rabiner and Schafer, Digital Processing of
Speech Signals, pages 130-133, published by Prentice-Hall, ISBN
0-13-213603-1, TK7882.S65R3]. In particular, Rabiner & Schafer
disclose an algorithm for detecting the endpoints of an isolated
speech utterance which involves computing a zero-crossing rate
signal and an average magnitude signal based on the signal of
interest. The zero-crossing rate signal is calculated using a
moving window with a 10 millisecond time-width: the number of
zero-crossings in a 10 millisecond window is reported as a
measure of the local zero-crossing rate. Similarly, the average
magnitude signal is calculated using a moving window with a 10
millisecond time-width: a weighted sum of the magnitudes
(absolute values) of samples in a window is reported as a
measure of local energy.

The zero-crossing rate and average magnitude signals are assumed
to contain no speech content during an initial training period.
The zero-crossing rate signal and average magnitude signal
samples during this training period are subjected to a
statistical analysis to determine two different average
magnitude thresholds and one zero-crossing rate threshold. The
algorithm uses the two average magnitude thresholds and the
zero-crossing rate threshold to determine the endpoints of a
speech utterance in the signal of interest.

The algorithm operates as follows. First, the average magnitude
signal is searched to determine a maximal interval [A,B] with
the property that the average magnitude signal exceeds the
larger magnitude threshold everywhere on the interval. Second,
the endpoints of the maximal interval are extended outward to
points where the average magnitude signal falls below the
smaller magnitude threshold, defining interval [C,D]. Third, the
zero-crossing rate signal is consulted to possibly extend the
endpoints even further. Namely, in the zero-crossing rate
signal, the 25 samples immediately to the left of (preceding) C
are searched. If the zero-crossing rate signal exceeds the
zero-crossing rate threshold three or more times in the 25
samples, the start point C is moved to the location of the first
such exceeding. Similarly, the finish point D is conditionally
moved to the right.

Thus, the algorithm disclosed by Rabiner & Schafer apparently
uses the observation that speech is associated with higher
zero-crossing rate and higher average magnitude (or energy) than
background noise. The algorithm of Rabiner & Schafer is
therefore unlikely to perform adequately in situations where the
background noise has power and zero-crossing rate comparable to
that of the signal. Thus a system and method are needed whereby
the initiation and/or termination of a speech signal may be
detected in a noise environment where the noise is not
necessarily of low zero-crossing rate or low energy. In
particular, a system and method are needed whereby the
termination of speech may be detected in a telephone message.

SUMMARY OF THE INVENTION

The system and method of the present invention uses a
zero-crossing rate measurement in order to determine the
initiation and/or termination of speech in an audio signal
input. The present invention is especially well suited for
detecting the termination of a telephone message in a telephone
answering device. Specifically, a sample of the zero-crossing
rate signal is determined (a) by counting the number of
consecutive speech samples required for the occurrence of a
pre-defined number of consecutive zero-crossings, or (b) by
counting the number of zero-crossings occurring in a pre-defined
number of consecutive speech samples. The former calculation
gives a zero-crossing period and the latter gives a
zero-crossing rate. However, the distinction is not significant
to the present invention. The resultant zero-crossing rate
signal is smoothed and applied to a differentiator. An energy
signal is then produced from the differentiated signal, by
measuring the energy in the differentiated signal over a moving
window in time. This energy measurement captures the amount of
variation of the zero-crossing rate signal. A short-time
magnitude integration is performed to measure the energy in the
differentiated signal.

Speech has a time-varying spectrum and hence also a time-varying
zero-crossing rate. Hence, while speech energy is present in the
audio input, the energy measurements should report large values.
In contrast, the non-speech signal which ensues at the end of a
telephone call after speech has terminated is a mixture of
tones, multi-tones, and Gaussian noise, having a locally
constant spectrum and thereby a locally constant zero-crossing
rate. Thus, when the speech signal is absent, the energy
measurements should report small values. By applying the energy
measurements to a threshold detection device, the present
invention produces a sequence of decision values indicating the
presence or absence of speech.

Furthermore, the present invention preferably includes
filtering the sequence of decision values. By examining a
moving-window of K consecutive decision values, a sequence of
"final" decision values may be asserted. Namely, in each window
the decision values which indicate the presence of speech are
counted. When the count exceeds a first threshold J, then a
final decision is asserted indicating the presence of speech.
Conversely, when the count is smaller than a second threshold I,
a final decision is asserted indicating the absence of speech.

BRIEF DESCRIPTION OF THE DRAWINGS

A better understanding of the present invention can be obtained
when the following detailed description of the preferred
embodiment is considered in conjunction with the following
drawings, in which:

FIG. 1A is a block diagram of a speech signal detector 100
according to the present invention;

FIG. 1B provides a motivation of the present invention by means
of a zero-crossing rate signal depicted during a transition from
speech to non-speech;

FIG. 2 is a block diagram of the zero-crossing rate calculator
110 according to the present invention;

FIG. 3 is a block diagram of the differentiation unit 120
according to the present invention;

FIG. 4 is a block diagram of the discriminator 130 according to
the present invention;

FIG. 5 is a speech storage device 500 according to the present
invention;

FIG. 6 is a block diagram of a telephone answering device 600
according to the present invention; and

FIG. 7 is a block diagram of a preferred embodiment of the
speech signal detector 100 according to the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Referring now to FIG. 1A, a block diagram of a speech signal
detector 100 according to the preferred embodiment of the
present invention is shown. The speech signal detector 100
comprises an input 105, a zero-crossing rate calculator 110, a
differentiation unit 120, a discriminator 130, and an output
140. The zero-crossing rate calculator 110 is coupled to input
105. The zero-crossing rate calculator 110 is also coupled to
the differentiation unit 120. The differentiation unit 120 is
coupled to the discriminator 130. And the discriminator 130 is
coupled to the output 140.

An input signal is supplied to the speech signal detector 100
through input 105. In the preferred embodiment of the invention,
the input signal is a digitized telephone signal. The
zero-crossing rate calculator operates on the input signal to
produce a zero-crossing rate signal. A sample of the
zero-crossing rate signal provides a measure of local
zero-crossing rate in the input signal. The zero-crossing rate
signal is provided to differentiation unit 120. The
differentiation unit 120 uses the zero-crossing rate signal to
calculate a differentiated zero-crossing rate signal. The
differentiated zero-crossing rate signal measures the variation
(or rate of change) of the zero-crossing rate signal. The
differentiated zero-crossing rate signal is supplied to the
discriminator 130. The discriminator 130 uses the differentiated
zero-crossing rate signal to determine the instantaneous
presence or absence of speech in the input signal. An output
signal, reflecting the instantaneous presence or absence of
speech in the input signal, is provided by discriminator 130 via
output 140.

Referring now to FIG. 2, a block diagram of the zero-crossing
rate calculator 110 according to the present invention is shown.
The zero-crossing rate calculator 110 operates on the input
signal to produce a zero-crossing rate signal. The zero-crossing
rate calculator 110 comprises a false-crossing pre-filter 210
and a zero-crossing rate measurement unit 220. The
false-crossing pre-filter 210 is coupled to the input 105. Also
the false-crossing pre-filter 210 is coupled to the
zero-crossing rate measurement unit 220. The zero-crossing rate
measurement unit 220 has an output which is coupled to the
differentiation unit 120.

The false-crossing pre-filter 210 receives the input signal via
the input 105, and serves to map low amplitude input samples to
zero. This pre-filtering eliminates spurious zero-crossings due
to noise, especially during the low level part of a dual tone
beat. The false-crossing pre-filter 210 operates on each input
sample to produce an output sample according to the following
rule: if the absolute value of an input sample is smaller than a
fixed threshold, the output sample is set to zero, else the
output sample is equal to the input sample. The output signal
thereby produced is referred to as the modified input signal.

The zero-crossing rate measurement unit 220 receives the
modified input signal from the false-crossing pre-filter 210 and
produces a zero-crossing rate signal. The zero-crossing rate
signal comprises a sequence of ZCR samples. A ZCR sample is
calculated by counting the number of samples required for the
occurrence of L successive zero-crossings in the input signal,
where L is a system defined constant. Thus a ZCR sample actually
measures the local zero-crossing period. However, the
distinction between zero-crossing rate and period is not
significant for the present invention. In an essentially
equivalent embodiment of the invention, a ZCR sample is
calculated by counting the number of zero-crossings which occur
in a window of M successive samples of the input signal, where M
is a system defined constant.

Referring now to FIG. 1B, a motivation of the present invention
is provided by means of a zero-crossing rate signal depicted
during a transition from speech to non-speech. Notice that
speech is associated with a time-varying zero-crossing rate
(ZCR), while the tonal signals and/or noise, which occur after
the speech message, have relatively constant zero-crossing rate.
By performing a differentiation operation, the intrinsic
variation (rate of change) of the zero-crossing rate signal is
exposed. Furthermore, by performing a moving-window integration
of the absolute value (magnitude) of the differentiated signal,
the variation in the zero-crossing rate is monitored on a
continuous basis. A large value for the magnitude integration
indicates the presence of speech, and a small value indicates
the absence of speech.

Referring now to FIG. 3, a block diagram of the differentiation
unit 120 according to the present invention is presented. The
differentiation unit 120 uses the zero-crossing rate signal
received from the zero-crossing rate calculator 110 to calculate
a differentiated zero-crossing rate signal. The differentiation
unit 120 comprises a smoothing filter 310 and a differentiator
320. The smoothing filter 310 is coupled to receive the
zero-crossing rate signal from the zero-crossing rate calculator
110. Also the smoothing filter 310 is coupled to the
differentiator 320. The differentiator has an output which is
coupled to the discriminator 130.

The smoothing filter 310 operates on the zero-crossing rate
signal and produces a filtered zero-crossing rate signal. In the
preferred embodiment of the invention, the smoothing filter is
an N-tap median filter (N=3). The purpose of the median filter
is to remove outlying values from the zero-crossing rate signal.
This type of filtering (a) increases the
smoothness of the zero-crossing rate signal when the input
signal has a constant spectrum (as occurs for tonal sequences),
and (b) leaves the zero-crossing rate signal relatively
unchanged when the input signal is speech, since speech has a
dynamic spectrum.

The filtered zero-crossing rate signal is provided to the
differentiator 320. The differentiator 320 performs a
differentiation operation on the filtered zero-crossing rate
signal, producing a differentiated zero-crossing rate signal. In
the preferred embodiment of the invention, the differentiator
performs a first difference for the sake of computational
efficiency. However, in alternate embodiments, any numerical
differentiation algorithm may be employed, subject to
fundamental design constraints for computational efficiency and
accuracy.

Referring now to FIG. 4, a block diagram of the discriminator
130 according to the present invention is shown. The
discriminator 130 uses the differentiated zero-crossing rate
signal to determine the instantaneous presence or absence of
speech in the input signal. An output signal, reflecting the
instantaneous presence or absence of speech in the input signal,
is provided by discriminator 130 via output 140. The
discriminator 130 includes a magnitude integration unit 410, a
threshold detector 420, and a final decision unit 430. The
magnitude integration unit 410 is coupled to receive the
differentiated zero-crossing rate signal from the
differentiation unit 120. Also the magnitude integration unit
410 is coupled to the threshold detector 420. The threshold
detector 420 is coupled to the final decision unit 430, and the
final decision unit 430 is coupled to output 140.

The magnitude integration unit 410 performs a short-time
magnitude integration on the differentiated zero-crossing rate
signal. Thus, each output value from the magnitude integration
unit 410 is computed by integrating the absolute value of the
differentiated zero-crossing rate signal over a corresponding
window (of length P samples). In the preferred embodiment of the
invention, the integral is performed using the "leaky
integrator" given by the transfer function

    H(z) = (1 - a) / (z - a).

In other words, if y(n) represents the value of an integral as
it accumulates through the sample window, and x(n) represents
the differentiated zero-crossing rate signal, the leaky
integration is governed by the recurrence relation

    [recurrence relation illegible in the source]

At the beginning of the sample window, the cumulative integral
y(n) is initialized to zero. Then the recursive expression above
is applied for every sample x(n) in the P-sample window. At the
end of the sample window, the resultant value of the accumulated
integral is reported as the output value. The cumulative
integral y(n) is then re-initialized to zero for the next sample
window integration. The output of the magnitude integration unit
410, referred to as the detection signal, is fed to the
threshold detector 420.

In an alternate embodiment of the invention, the integration
over a sample window referred to above is performed by an FIR
filter. In this case, the output value is a weighted average of
the absolute values of the samples in the sample window.

In yet another embodiment of the invention, the absolute value
mentioned above is replaced by a square. In this case the output
values comprise energy measurements.

The threshold detector 420 compares the resultant (integration)
values comprising the detection signal to a fixed detection
threshold R, and generates a sequence of decision values. If a
resultant value exceeds the threshold R, the corresponding
decision value is assigned a symbol which indicates the presence
of speech. If the resultant value does not exceed the threshold
R, the corresponding decision value is assigned a symbol which
indicates the absence of speech. In the preferred embodiment,
the detection threshold takes the value [illegible]. The
sequence of decision values is referred to as a decision signal.
The decision signal is supplied to the final decision unit 430.

The final decision unit 430 uses the decision signal to produce
a sequence of final decision values. To calculate the final
decision values, the final decision unit 430 employs a moving
window of K successive decision values from the decision signal.
Namely, a final decision value is calculated by counting a
number of the K successive decision values which indicate the
absence of speech. If the resultant number is larger than a
first threshold J, then the final decision value is assigned a
symbol indicating the absence of speech. If the resultant number
is less than a second threshold I, then the final decision value
is assigned a symbol indicating the presence of speech. The
integers I and J are system defined constants with I less than
or equal to J. The use of two distinct thresholds adds some
hysteresis to the final decision process and aids in the
prevention of spurious changes. The sequence of final decision
values is referred to as a final decision signal. The final
decision signal is asserted as the output of the final decision
unit 430 via output 140.

In the preferred embodiment of the invention, the speech signal
detector 100 operates as part of a telephone answering device.
In this case it is important to detect the termination of the
speech message so as to conserve storage space in the memory
media which stores the speech message. However, it is essential
that the answering machine capture the whole speech message.
Thus the speech signal detector 100 must guard against
premature/false detection of the end of the speech message.
Decreasing the value of the first threshold J increases the
probability of detecting the absence of speech. However,
increasing the value of threshold J decreases the probability of
false detection of the absence of speech. The value of J must be
chosen to balance these competing requirements. In the preferred
embodiment, K is chosen to equal 20, J is chosen to equal 16,
and I is chosen to equal 14.

Referring now to FIG. 5, a speech storage device 500 according
to the present invention is shown. The speech storage device 500
comprises an input 105, speech signal detector 100 (of FIG. 1),
memory media 510, and control line 520. The input 105 is coupled
to the speech signal detector 100 and to memory media 510. The
speech signal detector 100 is coupled to the memory media 510
via control line 520. An input signal is supplied to the speech
storage device via input 105. It is assumed that at least a
portion of the input signal contains a speech signal. The memory
media 510 is operable to store the input signal. The speech
signal detector 100 is operable to detect the
initiation/termination of the speech signal within the input
signal as described above. The control line 520 is identical to
the output 140 (of FIG. 1) of the speech signal detector 100.
The speech signal detector 100 provides an output signal via
control line 520 indicating initiation/termination of the speech
signal, and the output signal is used to control the storage of
the input signal into the memory media 510. In particular,
storage is enabled when the output signal indicates initiation
of the speech signal, and disabled when the output signal
indicates termination of the speech signal.
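The storage control described for FIG. 5 can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `record_speech`, the frame-based interface, and the one-flag-per-frame alignment are assumptions made only for illustration.

```python
def record_speech(frames, speech_flags):
    """Store frames only while the detector reports speech.

    Assumption: speech_flags[n] is the detector output for frames[n]
    (True = speech present). Storage is enabled on a False-to-True
    change (initiation) and disabled on a True-to-False change
    (termination).
    """
    stored = []
    storing = False
    for frame, speech in zip(frames, speech_flags):
        if speech and not storing:
            storing = True      # initiation detected: enable storage
        elif not speech and storing:
            storing = False     # termination detected: disable storage
        if storing:
            stored.append(frame)
    return stored
```

In this sketch the list `stored` stands in for the memory media 510, and the sequence of flags stands in for the signal on control line 520.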
Referring now to FIG. 6, a block diagram of a telephone
answering device 600 according to the present invention is 
shown. The telephone answering device 600 comprises an 
interface unit 610, a control unit 620, a speaker 630, a 
microphone 635, a control panel 640, speech signal detector 
100, and memory media 650. The interface unit 610 is 
coupled to a central office of an external telephone system 
via a telephone line 602. Interface unit 610 is coupled to 
control unit 620, speech signal detector 100 (as illustrated in 
FIG. 1, and described in detail above), speaker 630, micro- 
phone 635, and memory media 650. Control unit 620 is 
coupled to control panel 640. It is noted that control panel 
640 may comprise a graphical user interface (GUI) of a 
computer system (not shown). Control unit 620 is also 
coupled to speech signal detector 100 and memory media 
650. 

If a user of telephone answering device 600 does not 
answer an incoming telephone call within a predetermined 
number of ring signals, telephone answering device 600
"answers" the incoming telephone call. Answering the tele-
phone call includes the telephone answering device 600
simulating an "off-hook" condition. Telephone answering 
device 600 then transmits a pre-recorded outgoing voice 
message over telephone line 602. Telephone answering 
device 600 then stores a calling party's audible response 
(i.e., an incoming voice message) into memory media 650. 

Speech signal detector 100 receives a digitized telephone 
signal from interface unit 610, and provides to control unit 
620 a control signal which indicates the termination of the 
speech message (in the telephone signal input). The tele- 
phone answering device 600 disables storage when the 
control signal indicates termination of the speech message. 

Referring now to FIG. 7, a block diagram of a preferred 
embodiment of the speech signal detector 100 according to 
the present invention is presented. In this embodiment, the 
speech signal detector 100 comprises: a threshold input unit 
710; a functional block 720 which counts the number of 
samples for achieving a specified number of zero-crossings;
a 3-tap median filter 730; a first difference operation 740; an 
absolute value calculation 750; a leaky integrator 760; and 
a block 770 which tests the detection signal and makes the 
vox (voice activity) decision. 
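The first four blocks of FIG. 7 (threshold input, sample counting, 3-tap median filter, first difference) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names are hypothetical, a zero-crossing is taken to be a sign change between consecutive non-zero samples, measurements are taken back to back without overlap, and the median filter passes its two end values through unchanged.

```python
from statistics import median


def prefilter(samples, threshold):
    """Map low-amplitude samples to zero (false-crossing pre-filter)."""
    return [0 if abs(s) < threshold else s for s in samples]


def zero_crossing_periods(samples, num_crossings):
    """Count the samples needed for each run of num_crossings crossings.

    Assumption: a crossing is a sign change between consecutive
    non-zero samples; zeroed samples are skipped when comparing signs.
    """
    periods = []
    last_sign = 0
    crossings = 0
    count = 0
    for s in samples:
        count += 1
        sign = (s > 0) - (s < 0)
        if sign == 0:
            continue
        if last_sign != 0 and sign != last_sign:
            crossings += 1
            if crossings == num_crossings:
                periods.append(count)
                crossings = 0
                count = 0
        last_sign = sign
    return periods


def median3(values):
    """3-tap median filter; the two end values are passed through."""
    out = list(values)
    for i in range(1, len(values) - 1):
        out[i] = median(values[i - 1:i + 2])
    return out


def first_difference(values):
    """First difference d[n] = v[n] - v[n-1]."""
    return [b - a for a, b in zip(values, values[1:])]
```

Chaining the four functions on a steady tone gives a differentiated signal that stays near zero, while speech, whose zero-crossing period keeps changing, gives larger differences.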

Threshold input unit 710 is identical to false crossing 
pre-filter 210 of FIG. 2. The function block 720, which 
counts the number of samples for achieving a specified 
number of zero-crossings, is identical to zero-crossing rate 
measurement unit 220 of FIG. 2. The 3-tap median filter 730 
is a realization of the smoothing filter 310 of FIG. 3. The first 
difference operation 740 is a realization of differentiator 320 
of FIG. 3. The absolute value calculation 750 and the leaky 
integrator 760 are together equivalent to the magnitude 
integration unit 410 of FIG. 4. The block 770, which tests the 
detection signal and makes the vox (voice activity) decision, 
is equivalent to a combination of the threshold detector 420 
and the final decision unit 430 of FIG. 4. 
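The remaining blocks of FIG. 7 (absolute value, leaky integrator, and decision) can be sketched in the same way. The sketch rests on assumptions the source does not settle. The leaky-integrator update `y = leak * y + (1 - leak) * |x|` is one common form, chosen here because the recurrence is not legible in the source. Holding the previous final decision when the count lies between I and J is one reading of the hysteresis described for the final decision unit 430. The function names and the `initial_speech` parameter are hypothetical.

```python
def leaky_window_integrals(diff_signal, window_len, leak):
    """Integrate |x| over consecutive windows of window_len samples.

    Assumed recurrence: y = leak * y + (1 - leak) * |x|, with y reset
    to zero at the start of each window; one value per window.
    """
    results = []
    y = 0.0
    for n, x in enumerate(diff_signal, start=1):
        y = leak * y + (1.0 - leak) * abs(x)
        if n % window_len == 0:
            results.append(y)
            y = 0.0
    return results


def threshold_decisions(detection_signal, threshold):
    """True where the detection value exceeds the threshold (speech)."""
    return [v > threshold for v in detection_signal]


def final_decisions(decisions, k=20, j=16, i=14, initial_speech=True):
    """Vote over the last k decisions with hysteresis.

    Counts the decisions that indicate absence of speech: more than j
    switches the state to absence, fewer than i switches it to
    presence, and a count in between keeps the previous state
    (an assumption).
    """
    speech = initial_speech
    out = []
    for end in range(k, len(decisions) + 1):
        window = decisions[end - k:end]
        absent = sum(1 for d in window if not d)
        if absent > j:
            speech = False
        elif absent < i:
            speech = True
        out.append(speech)
    return out
```

The defaults K = 20, J = 16, and I = 14 are the preferred-embodiment values given in the description; the window length P, the leak factor, and the detection threshold must be supplied by the caller.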

Although the system and method of the present invention 
has been described in connection with the preferred 
embodiment, it is not intended to be limited to the specific 
forms set forth herein, but on the contrary, it is intended to 
cover such alternatives, modifications, and equivalents, as 
can be reasonably included within the spirit and scope of the 
invention as defined by the appended claims. 

I claim: 

1. A system for detecting initiation/termination of a 
speech signal for a speech storage device, the system com- 
prising: 

an input for receiving an input signal, wherein at least a 
portion of said input signal includes a speech signal; 



a zero-crossing rate calculator coupled to said input for 
computing a zero-crossing rate signal based upon said 
input signal; 

a differentiation unit coupled to said zero-crossing rate 
calculator which receives said zero-crossing rate signal 
from said zero-crossing rate calculator, wherein the 
differentiation unit is configured to perform a differen- 
tiation operation with respect to time to produce a 
differentiated zero-crossing rate signal; 
a discriminator coupled to said differentiation unit which 
receives said differentiated zero-crossing rate signal, 
wherein said discriminator comprises a magnitude inte- 
gration unit which is configured to integrate an absolute 
value of said differentiated zero-crossing rate signal to 
generate a series of resultant values, wherein said 
discriminator determines initiation/termination of said 
speech signal within said input signal based on the 
series of resultant values; 
wherein said discriminator generates an output signal 
indicating initiation/termination of said speech signal 
within said input signal, wherein said output signal is 
used to control storage of said speech signal. 

2. The system of claim 1, wherein said differentiation unit 
includes a smoothing filter, wherein said smoothing filter 
smoothes said zero-crossing rate signal and thereby pro- 
duces a filtered zero-crossing rate signal, wherein said 
differentiation unit performs said differentiation operation 
with respect to time on said filtered zero-crossing rate signal 
to produce the differentiated zero-crossing rate signal. 

3. The system of claim 2, wherein said smoothing filter 
comprises a median filter. 

4. The system of claim 2, wherein said differentiation unit 
calculates a first difference on said filtered zero-crossing rate 
signal to produce said differentiated zero-crossing rate sig- 
nal. 

5. The system of claim 1, wherein the input signal 
comprises a sequence of input samples, wherein said zero- 
crossing rate calculator includes a false-crossing pre-filter, 
wherein said false-crossing pre-filter modifies the input 
signal by assigning a zero value to an input sample if the 
absolute value of the input sample is below a pre-determined 
threshold, wherein said false-crossing pre-filter produces a 
modified input signal, wherein said zero-crossing rate signal 
is computed based on said modified input signal. 

6. The system of claim 1, wherein the input signal 
comprises a sequence of input samples, wherein said zero- 
crossing rate calculator generates a sequence of sample 
counts, wherein each sample count of said sequence of 
sample counts represents the number of said input samples 
required for the occurrence of L successive zero-crossings in
said input signal, wherein L is a pre-defined positive integer, 
wherein said sequence of sample counts comprises said 
zero-crossing rate signal. 

7. The system of claim 1, wherein the input signal 
comprises a sequence of input samples, wherein said zero-
crossing rate calculator generates a sequence of zero-
crossing counts, wherein each zero-crossing count of said
sequence of zero-crossing counts represents the number of
zero-crossings occurring in M successive samples of said
input signal, wherein M is a pre-defined positive integer,
wherein said sequence of zero-crossing counts comprises
said zero-crossing rate signal.

8. The system of claim 1, wherein said magnitude inte- 
gration unit is configured to calculate each resultant value of 
said series of resultant values by integrating absolute values 
of P consecutive samples of said differentiated zero-crossing 
rate signal, wherein P is a system specified integer constant, 
wherein said series of resultant values comprises a detection 
signal;
wherein said discriminator further comprises a threshold
detector coupled to said magnitude integration unit, 
wherein said threshold detector compares said resultant 
values comprising said detection signal with a thresh- 
old value, and generates a sequence of first decision 
values, wherein a first decision value indicates the 
presence of said speech signal if a respective resultant 
value exceeds said threshold, and wherein the first 
decision value indicates the absence of said speech 
signal if the respective resultant value does not exceed 
said threshold, wherein said sequence of first decision 
values comprises a first decision signal. 
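A minimal sketch of the magnitude integration and threshold comparison of claim 8 follows. The names `magnitude_integrate` and `threshold_decisions` are hypothetical. The sketch assumes that the "integration" is a plain sum of absolute values and that the window of P samples slides forward by one sample at a time; the claim fixes neither choice.

```python
def magnitude_integrate(diff_zcr, p):
    """Sum the absolute values of each run of p consecutive samples."""
    if p < 1:
        raise ValueError("p must be a positive integer")
    return [sum(abs(v) for v in diff_zcr[i:i + p])
            for i in range(len(diff_zcr) - p + 1)]


def threshold_decisions(detection_signal, threshold):
    """Return 1 (speech present) where a value exceeds the threshold, else 0."""
    return [1 if v > threshold else 0 for v in detection_signal]
```

A value equal to the threshold maps to 0 here, matching the claim language "does not exceed said threshold".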

9. The system of claim 8, wherein said discriminator 
operates on said first decision signal to produce a second 
decision signal, wherein said second decision signal com- 
prises a sequence of second decision values, wherein a 
second decision value is determined using K successive 
values of said first decision signal, wherein K is a pre- 
defined integer constant, wherein said discriminator deter- 
mines a number of said K successive values which indicate 
presence of said speech signal, and uses said number to 
determine said second decision value, wherein said second 
decision value indicates either presence or absence of said 
speech signal, wherein said second decision signal com- 
prises said output signal of said discriminator. 
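The decision filtering of claim 9 can be illustrated as a vote over K successive first decision values. The claim says only that the number of "presence" values is used to determine the second decision value; the simple-majority default below, the sliding window, and the names `smooth_decisions` and `min_votes` are assumptions made for this sketch.

```python
def smooth_decisions(first_decisions, k, min_votes=None):
    """Derive second decision values from windows of k first decision values.

    Assumption: a window indicates speech when at least min_votes of its
    k values are 1; by default min_votes is a simple majority.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if min_votes is None:
        min_votes = k // 2 + 1
    second_decisions = []
    for i in range(len(first_decisions) - k + 1):
        votes = sum(first_decisions[i:i + k])  # count of "speech present" values
        second_decisions.append(1 if votes >= min_votes else 0)
    return second_decisions
```

With K = 3, an isolated "presence" value surrounded by "absence" values is voted down, which is one way to obtain a more definitive decision sequence.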

10. The system of claim 1, wherein said system is 
comprised in a speech storage device, wherein said speech 
storage device receives and stores said input signal; 

wherein said speech storage device receives from said 
discriminator said output signal indicating initiation/ 
termination of said speech signal within said input 
signal, and uses said output signal to control storage of 
said input signal, wherein said speech storage device 
disables storage of said input signal when said output 
signal indicates termination of said speech signal, and 
enables storage of said input signal when said output 
signal indicates initiation of said speech signal. 
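The storage control of claim 10 can be pictured as converting the discriminator output into intervals during which storage is enabled. The function name `storage_segments` is hypothetical, and the sketch assumes one output value per storable unit, with 1 meaning speech present and 0 meaning speech absent.

```python
def storage_segments(speech_flags):
    """Return (start, end) index pairs during which storage is enabled.

    Each pair is end-exclusive. Storage is enabled on a 0-to-1 transition
    (initiation) and disabled on a 1-to-0 transition (termination).
    """
    segments = []
    start = None
    for i, flag in enumerate(speech_flags):
        if flag and start is None:
            start = i                      # initiation: enable storage
        elif not flag and start is not None:
            segments.append((start, i))    # termination: disable storage
            start = None
    if start is not None:
        # Input ended while speech was still indicated.
        segments.append((start, len(speech_flags)))
    return segments
```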

11. A method for detecting initiation/termination of a 
speech signal for a speech storage device, the method 
comprising: 

receiving an input signal, wherein at least a portion of said
input signal includes a speech signal;

calculating a zero-crossing rate signal based on said input
signal;

performing a differentiation operation with respect to time 
to generate a differentiated zero-crossing rate signal; 

integrating an absolute value of the differentiated zero-
crossing rate signal in order to compute a series of
resultant values; 

determining initiation/termination of said speech signal 
based on said series of resultant values, wherein said 
determining initiation/termination of said speech signal 
includes generating a control signal which indicates 
initiation/termination of said speech signal; 

wherein said control signal is used to control storage of 
said speech signal. 
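The steps of claim 11 can be chained into a single sketch. Everything concrete here is an assumption chosen for illustration: the names `zero_crossing_counts`, `detect_speech` and `square_wave`, the default parameter values, the use of fixed non-overlapping frames, a median smoother, a first difference, and a sliding sum. The synthetic square waves stand in for real audio only to show the qualitative behavior: a steady tone has a nearly constant zero-crossing rate and so produces no detections, while an input whose zero-crossing rate keeps changing does.

```python
import statistics


def zero_crossing_counts(samples, m):
    """Zero-crossings per non-overlapping frame of m samples."""
    counts, prev_sign = [], 0
    for start in range(0, len(samples) - m + 1, m):
        crossings = 0
        for x in samples[start:start + m]:
            sign = (x > 0) - (x < 0)
            if sign and prev_sign and sign != prev_sign:
                crossings += 1
            if sign:
                prev_sign = sign
        counts.append(crossings)
    return counts


def detect_speech(samples, m=16, median_len=3, p=2, threshold=5):
    """Return one 0/1 decision per magnitude-integration window."""
    zcr = zero_crossing_counts(samples, m)
    # Smooth the zero-crossing rate signal with a sliding median.
    smoothed = [statistics.median(zcr[i:i + median_len])
                for i in range(len(zcr) - median_len + 1)]
    # Differentiate with respect to time (first difference).
    diff = [b - a for a, b in zip(smoothed, smoothed[1:])]
    # Integrate the absolute value over p consecutive samples.
    energy = [sum(abs(v) for v in diff[i:i + p])
              for i in range(len(diff) - p + 1)]
    # Compare with the threshold to obtain the decisions.
    return [1 if e > threshold else 0 for e in energy]


def square_wave(half_period, n):
    """Synthetic test input: n samples, sign flips every half_period samples."""
    return [1 if (i // half_period) % 2 == 0 else -1 for i in range(n)]


steady_tone = square_wave(4, 192)
varying = (square_wave(1, 32) + square_wave(8, 32)
           + square_wave(1, 32) + square_wave(8, 32))
print(detect_speech(steady_tone))  # [0, 0, 0, 0, 0, 0, 0, 0]
print(detect_speech(varying))      # [1, 1, 1, 1]
```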

12. The method of claim 11, wherein said performing a 
differentiation operation comprises: 

smoothing said zero-crossing rate signal and thereby 
producing a filtered zero-crossing rate signal; 

differentiating said filtered zero-crossing rate signal with 
respect to time in order to generate the differentiated 
zero-crossing rate signal. 

13. The method of claim 12, wherein said smoothing said 
zero-crossing rate signal comprises applying a median filter 
algorithm to said zero-crossing rate signal. 

14. The method of claim 12, wherein said differentiating
said filtered zero-crossing rate signal with respect to time 
comprises performing a first difference on said filtered 
zero-crossing rate signal. 
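The median smoothing of claim 13 and the first difference of claim 14 can be sketched together. The names `median_smooth` and `first_difference`, the odd window length, and the edge handling (repeating the end values) are assumptions; the claims name only "a median filter algorithm" and "a first difference".

```python
import statistics


def median_smooth(values, length=3):
    """Sliding median with an odd window; ends are padded by repetition."""
    if length < 1 or length % 2 == 0:
        raise ValueError("length must be a positive odd integer")
    if not values:
        return []
    half = length // 2
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    return [statistics.median(padded[i:i + length])
            for i in range(len(values))]


def first_difference(values):
    """Return values[n + 1] - values[n] for each adjacent pair."""
    return [b - a for a, b in zip(values, values[1:])]


zcr = [4, 4, 4, 9, 4, 4, 4]                  # one isolated outlier
print(first_difference(zcr))                 # [0, 0, 5, -5, 0, 0]
print(first_difference(median_smooth(zcr)))  # [0, 0, 0, 0, 0, 0]
```

The example shows why smoothing precedes differentiation: a single outlier in the zero-crossing rate signal would otherwise appear as two large values in the differentiated signal.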

15. The method of claim 11, wherein said input signal 
comprises a sequence of input samples, wherein said calcu- 
lating a zero-crossing rate signal based on said input signal 
includes: 

modifying said input signal by assigning a zero value to 
an input sample if the absolute value of the input 
sample is below a pre-determined threshold, wherein 
said modifying produces a modified input signal; 

wherein said zero-crossing rate signal is based on said 
modified input signal. 
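The input modification of claim 15 amounts to replacing small-magnitude samples with zero before zero-crossings are counted. In the sketch below the names `zero_small_samples` and `count_zero_crossings` are hypothetical, and the rule that a zeroed sample cannot itself create a crossing is an assumption about how the downstream counter treats zeros.

```python
def zero_small_samples(samples, threshold):
    """Assign zero to every sample whose absolute value is below threshold."""
    return [0 if abs(x) < threshold else x for x in samples]


def count_zero_crossings(samples):
    """Count sign changes between successive non-zero samples (assumed rule)."""
    crossings, prev_sign = 0, 0
    for x in samples:
        sign = (x > 0) - (x < 0)
        if sign != 0:
            if prev_sign != 0 and sign != prev_sign:
                crossings += 1
            prev_sign = sign
    return crossings


noise = [0.01, -0.02, 0.01, -0.01]
print(count_zero_crossings(noise))                            # 3
print(count_zero_crossings(zero_small_samples(noise, 0.05)))  # 0
```

Low-level noise that would otherwise register as a high zero-crossing rate contributes no crossings after the modification.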

16. The method of claim 11, wherein the input signal 
comprises a sequence of input samples, wherein said calcu- 
lating a zero-crossing rate signal comprises generating a 
sequence of sample counts, wherein each sample count of 
said sequence of sample counts represents the number of 
said input samples required for the occurrence of L succes- 
sive zero-crossings in said input signal, wherein L is a 
pre-defined positive integer, wherein said sequence of 
sample counts comprises said zero-crossing rate signal. 

17. The method of claim 11, wherein the input signal 
comprises a sequence of input samples, wherein said calcu- 
lating a zero-crossing rate signal comprises generating a 
sequence of zero-crossing counts, wherein each zero- 
crossing count of said sequence of zero-crossing counts 
represents the number of zero-crossings occurring in M 
successive input samples of said input signal, wherein said 
sequence of zero-crossing counts comprises said zero- 
crossing rate signal. 

18. The method of claim 11, wherein said integrating the 
absolute value of the zero-crossing rate signal comprises 
computing each of the resultant values by integrating P 
consecutive samples of said differentiated zero-crossing rate 
signal, wherein P is a system specified integer constant, 
wherein said series of resultant values comprises a detection 
signal; 

wherein said determining initiation/termination of said 
speech signal based on said series of resultant values
comprises comparing said resultant values comprising 
said detection signal with a threshold value, and gen- 
erating a sequence of first decision values, wherein a 
first decision value indicates the presence of said 
speech signal if a respective resultant value exceeds 
said threshold, and wherein the first decision value 
indicates the absence of said speech signal if the 
respective value does not exceed said threshold, 
wherein said sequence of first decision values com- 
prises a first decision signal. 

19. The method of claim 18, wherein said determining 
initiation/termination of said speech signal based on said 
differentiated zero-crossing rate signal further comprises: 

producing a sequence of second decision values using 
said first decision signal, wherein each second decision 
value is produced using a corresponding window of K 
successive first decision values from said first decision 
signal, wherein K is a pre-defined integer constant, 
wherein producing a second decision value comprises: 
determining a number of said K successive values 
which indicate presence of said speech signal; and 
using said number to determine said second decision 
value, wherein said second decision value indicates 
either presence or absence of said speech signal; 
wherein said second decision signal comprises said con- 
trol signal. 

20. The method of claim 11, wherein said method operates
in a speech storage device, the method further comprising: 

storing said input signal in response to said control signal 
indicating initiation of said speech signal; 

discontinuing said storing said input signal in response to 
said control signal indicating termination of said speech 
signal. 

21. A system for detecting termination of a speech mes- 
sage for a speech storage device, the system comprising: 

an input for receiving an input signal, wherein at least a 
portion of said input signal includes a speech message 
signal; 

a zero-crossing rate calculator coupled to said input for 
computing a zero-crossing rate signal based upon said 
input signal; 

a differentiation unit coupled to said zero-crossing rate 
calculator which receives said zero-crossing rate signal
from said zero-crossing rate calculator, wherein the 
differentiation unit is configured to perform a differen- 
tiation operation with respect to time to produce a 
differentiated zero-crossing rate signal;

a discriminator coupled to said differentiation unit which 
receives said differentiated zero-crossing rate signal, 
wherein said discriminator comprises a magnitude inte- 
gration unit which is configured to integrate an absolute 
value of said differentiated zero-crossing rate signal to 
generate a series of resultant values, wherein said 
discriminator determines termination of said speech 
message signal within said input signal based on the 
series of resultant values; 

wherein said discriminator generates an output signal 
indicating termination of said speech message signal, 
wherein said output signal is used to control storage of 
said speech message signal. 

22. A telephone answering device comprising: 

an input for receiving an input signal, wherein at least a
portion of said input signal includes a speech message 
signal; 

a memory media which receives and stores said input 
signal; 

a message-termination detector coupled to said input, and 
operable to determine termination of said speech mes- 
sage signal within said input signal, wherein said 
message-termination detector generates a control signal 
indicating termination of said speech message signal; 
wherein said telephone answering device discontinues 
storage of said input signal in said memory media in 
response to said control signal indicating termination of 
said speech message signal; 
wherein said message-termination detector comprises:
a zero-crossing rate calculator coupled to said input for 
computing a zero-crossing rate signal based upon 
said input signal; 
a differentiation unit coupled to said zero-crossing rate 
calculator which receives said zero-crossing rate 
signal from said zero-crossing rate calculator, 
wherein the differentiation unit is configured to per- 
form a differentiation operation with respect to time 
to produce a differentiated zero-crossing rate signal; 
a discriminator coupled to said differentiation unit 
which receives said differentiated zero-crossing rate 
signal, wherein said discriminator comprises a mag- 
nitude integration unit which is configured to inte- 
grate an absolute value of said differentiated zero- 
crossing rate signal to generate a series of resultant 
values, wherein said discriminator determines termi- 
nation of said speech message signal within said 
input signal based on the series of resultant values. 